In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy and seek to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
translated by 谷歌翻译
我们提出了一种奖励预测,基于模型的深度学习方法,具有轨迹约束的视觉关注,用于Mapless,本地视觉导航任务。我们的方法学会在潜伏图像空间中的位置放置视觉注意,这跟踪车辆控制动作引起的轨迹,以提高规划期间的预测精度。注意模型由任务特定的损耗和额外的轨迹约束损失共同优化,允许适应性令人鼓舞的正则化结构,以改善泛化和可靠性。重要的是,视觉注意力应用于潜在特征映射空间而不是原始图像空间,以促进有效的规划。我们在规划低湍流的视觉导航任务中验证了我们的型号,在越野设置和爬坡地区的越野设置和山坡上爬上山坡的轨迹。实验涉及随机程序生成的模拟和现实世界环境。与关注和自我关注替代方案相比,我们发现我们的方法改善了泛化和学习效率。
translated by 谷歌翻译
我们考虑如何通过鲁棒地计算其边界的缩回的在线过程直接提取最初未知的二维环境的路线图(也称为拓扑表示)。在本文中,我们首先在在线建设拓扑地图和执行控制法,以指导机器人到最近的未开发地区,首先介绍[1]。所提出的方法通过允许机器人在局部构造的地图上定位本身来操作,计算对环境(前沿)的未探究部分的路径,当机器人完全探索环境时计算稳健的终端条件,并实现环路闭合检测。所提出的算法导致机器人导航需求的平滑安全路径。所提出的方法是任何时间算法,其具有优点:它允许从激光扫描数据中获取激光扫描数据的主动创建拓扑映射。我们还提出了一种基于启发式的导航策略,其中机器人针对拓扑映射中的节点,该拓扑地图开放到空的空间。然后,我们通过呈现[1]中的工作,呈现一种利用特定光谱对应方法[2]的强度来扩展[1]的工作,以匹配从我们拓扑制作算法生成的映射环境。在这里,我们专注于实现一种可以使用AFF骨架来匹配映射环境的拓扑的系统。在两个给定地图和他们的AOF骷髅之间的拓扑匹配中,我们首先在两个不同环境的AFOF骨架上的点之间找到相应的通知。然后我们将环境的(2D)点对齐。我们还基于其提取的AOF骨架及其拓扑在两个给定的环境之间计算距离测量,作为对应点之间的匹配错误的总和。
translated by 谷歌翻译
Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.
translated by 谷歌翻译
Applying deep learning concepts from image detection and graph theory has greatly advanced protein-ligand binding affinity prediction, a challenge with enormous ramifications for both drug discovery and protein engineering. We build upon these advances by designing a novel deep learning architecture consisting of a 3-dimensional convolutional neural network utilizing channel-wise attention and two graph convolutional networks utilizing attention-based aggregation of node features. HAC-Net (Hybrid Attention-Based Convolutional Neural Network) obtains state-of-the-art results on the PDBbind v.2016 core set, the most widely recognized benchmark in the field. We extensively assess the generalizability of our model using multiple train-test splits, each of which maximizes differences between either protein structures, protein sequences, or ligand extended-connectivity fingerprints. Furthermore, we perform 10-fold cross-validation with a similarity cutoff between SMILES strings of ligands in the training and test sets, and also evaluate the performance of HAC-Net on lower-quality data. We envision that this model can be extended to a broad range of supervised learning problems related to structure-based biomolecular property prediction. All of our software is available as open source at https://github.com/gregory-kyro/HAC-Net/.
translated by 谷歌翻译
In this paper, we propose a new short-term load forecasting (STLF) model based on contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term and seasonal dependencies across time series as well as to weight dynamically the input information. The model produces both point forecasts and predictive intervals. The experimental part of the work performed on 35 forecasting problems shows that the proposed model outperforms in terms of accuracy its predecessor as well as standard statistical models and state-of-the-art machine learning models.
translated by 谷歌翻译
In recent years several learning approaches to point goal navigation in previously unseen environments have been proposed. They vary in the representations of the environments, problem decomposition, and experimental evaluation. In this work, we compare the state-of-the-art Deep Reinforcement Learning based approaches with Partially Observable Markov Decision Process (POMDP) formulation of the point goal navigation problem. We adapt the (POMDP) sub-goal framework proposed by [1] and modify the component that estimates frontier properties by using partial semantic maps of indoor scenes built from images' semantic segmentation. In addition to the well-known completeness of the model-based approach, we demonstrate that it is robust and efficient in that it leverages informative, learned properties of the frontiers compared to an optimistic frontier-based planner. We also demonstrate its data efficiency compared to the end-to-end deep reinforcement learning approaches. We compare our results against an optimistic planner, ANS and DD-PPO on Matterport3D dataset using the Habitat Simulator. We show comparable, though slightly worse performance than the SOTA DD-PPO approach, yet with far fewer data.
translated by 谷歌翻译
It is known that neural networks have the problem of being over-confident when directly using the output label distribution to generate uncertainty measures. Existing methods mainly resolve this issue by retraining the entire model to impose the uncertainty quantification capability so that the learned model can achieve desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive and may not be feasible in many situations. In this work, we consider a more practical post-hoc uncertainty learning setting, where a well-trained base model is given, and we focus on the uncertainty quantification task at the second stage of training. We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities, which is effective and computationally efficient. Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties and easily adapt to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications over multiple representative image classification benchmarks.
translated by 谷歌翻译
Convolutional neural networks (CNNs) are currently among the most widely-used neural networks available and achieve state-of-the-art performance for many problems. While originally applied to computer vision tasks, CNNs work well with any data with a spatial relationship, besides images, and have been applied to different fields. However, recent works have highlighted how CNNs, like other deep learning models, are sensitive to noise injection which can jeopardise their performance. This paper quantifies the numerical uncertainty of the floating point arithmetic inaccuracies of the inference stage of DeepGOPlus, a CNN that predicts protein function, in order to determine its numerical stability. In addition, this paper investigates the possibility to use reduced-precision floating point formats for DeepGOPlus inference to reduce memory consumption and latency. This is achieved with Monte Carlo Arithmetic, a technique that experimentally quantifies floating point operation errors and VPREC, a tool that emulates results with customizable floating point precision formats. Focus is placed on the inference stage as it is the main deliverable of the DeepGOPlus model that will be used across environments and therefore most likely be subjected to the most amount of noise. Furthermore, studies have shown that the inference stage is the part of the model which is most disposed to being scaled down in terms of reduced precision. All in all, it has been found that the numerical uncertainty of the DeepGOPlus CNN is very low at its current numerical precision format, but the model cannot currently be reduced to a lower precision that might render it more lightweight.
translated by 谷歌翻译
With water quality management processes, identifying and interpreting relationships between features, such as location and weather variable tuples, and water quality variables, such as levels of bacteria, is key to gaining insights and identifying areas where interventions should be made. There is a need for a search process to identify the locations and types of phenomena that are influencing water quality and a need to explain why the quality is being affected and which factors are most relevant. This paper addresses both of these issues through the development of a process for collecting data for features that represent a variety of variables over a spatial region, which are used for training and inference, and analysing the performance of the features using the model and Shapley values. Shapley values originated in cooperative game theory and can be used to aid in the interpretation of machine learning results. Evaluations are performed using several machine learning algorithms and water quality data from the Dublin Grand Canal basin.
translated by 谷歌翻译